|
In machine learning and data mining, a string kernel is a kernel function that operates on strings, i.e. finite sequences of symbols that need not be of the same length. String kernels can be intuitively understood as functions measuring the similarity of pairs of strings: the more similar two strings ''a'' and ''b'' are, the higher the value of a string kernel ''K''(''a'', ''b'') will be. Using string kernels with kernelized learning algorithms such as support vector machines allow such algorithms to work with strings, without having to translate these to fixed-length, real-valued feature vectors.〔 String kernels are used in domains where sequence data are to be clustered or classified, e.g. in text mining and gene analysis.〔 〕 ==Informal introduction== Suppose one wants to compare some text passages automatically and indicate their relative similarity. For many applications, it might be sufficient to find some keywords which match exactly. One example where exact matching is not always enough is found in spam detection.〔 〕 Another would be in computational gene analysis, where homologous genes have mutated, resulting in common subsequences along with deleted, inserted or replaced symbols. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「String kernel」の詳細全文を読む スポンサード リンク
|